Method and apparatus for identifying a substance using a spectral library database

ABSTRACT

A spectroscopic detector for identifying the presence of a first substance in the presence of another substance includes a laser for illuminating the substances at a plurality of wavelengths to induce the emission of radiation characteristic of the substance; a spectrometer for measuring the emitted radiation to obtain a plurality of spectral measurement data; and a processor for processing the data. An algorithm combines the data into a composite spectrum and a parameter characteristic of the first substance is identified while information in the composite spectrum contributed by emission of radiation from the other substance is removed to identify the presence of the first substance and obtain a characteristic spectral signature of the first substance. The signature is compared to signatures in a spectral library database, wherein at least some of the library signatures have spectral characteristics differentiated from each other by identifiable spectral characteristics caused by environmental factors.

CROSS REFERENCE TO RELATED APPLICATIONS

The present application claims the benefit of the priority filing date of provisional patent applications No. 60/535,179, filed Jan. 7, 2004, and No. 60/601,180, filed Aug. 13, 2004, both incorporated herein by reference.

The present application is related to U.S. Ser. No. ______ , entitled “METHOD AND APPARATUS FOR IDENTIFYING A SUBSTANCE”, filed concurrently herewith.

FIELD OF THE INVENTION

This invention relates to a method and apparatus for identifying a substance in the presence of other substances. More particularly, the invention relates to a method and apparatus for identifying chemical, biological or other constituents of interest in the presence of other substances by comparing spectral signatures in a spectral Library database.

BACKGROUND OF THE INVENTION

Spectroscopic identification of bio-organisms utilizing resonance or near-resonance-Raman spectroscopy, described in U.S. Pat. No. 4,847,198, incorporated herein by reference, is a method through which biological organisms are identified from the highly structured emission spectra resonantly excited by illumination with Deep UltraViolet, ~0.2-0.3 micron, (DUV) radiation. FIG. 1 illustrates the prior art system described in U.S. Pat. No. 4,847,198. A light source 12 comprises a laser 14, e.g. a Nd-Yag device, producing high energy light pulses at 1064 nm, 532 nm, 355 nm, and 266 nm; a dye laser 16, e.g. a Quanta-Ray PDL-2 which shifts pulse energies from Yag frequencies to lower energies; and a wavelength extender 18 that either doubles the dye laser output or mixes the dye-laser output or doubled dye laser output with an Nd-Yag fundamental to produce pulsed UV light at a wavelength between 350-216 nm. The output from the wavelength extender 18 strikes a split prism 20 which produces two beams. A first reference beam strikes a mirror and is reflected onto a photodiode 22. The output from the photodiode is transmitted to a Princeton Applied Research Model 162 Boxcar Averager 24. A Spex Datamate DMO1 microcomputer 26 controls the stepping motor (not illustrated) of a monochromator 40, for general data acquisition and disc storage of spectra. The second beam from the prism 20 strikes a mirror 28 which directs the beam to a sample 30 under investigation. The energy backscattered from the sample is collimated by a lens 32, condensed by an optically aligned lens 34 and focused by the lens 34 on an entrance slit of the monochromator 40. In this manner, a single wavelength in the UV or DUV range illuminates the sample and backscattered energy, i.e. resonance or near-resonance enhanced Raman scattering, or Raman scattering from a microorganism with a characteristic spectrum or “fingerprint”, is collected.

FIGS. 2 a and b show the highly structured spectra of identifiable biological organisms resonantly excited by illumination of the sample to be examined with deep ultraviolet, ~0.2-0.3 micron, (DUV) light. FIG. 2 a shows spectra from Pseudomonas fluorescens (top), E. coli (second from top), Bacillus subtilis (third from top), and Staphylococcus epidermidis (bottom) illuminated by a single DUV wavelength. The spectra contain a few large peaks. Prior-art authors attempted to use the locations of the large peaks for identification. We observe that the peaks have different shapes, that the spectra are very structured and visibly different for each organism observed. With proper analysis techniques, therefore, the entire spectrum can be used to make an identification. FIG. 2 b shows spectra from B. megaterium spores illuminated at widely separated times by 4 different DUV wavelengths. Prior-art authors note that different illumination wavelengths produce major peaks at similar locations. We, however, observe that each individual illumination wavelength produces a spectrum that differs in features other than the major peak location, thus adding to the information that comprises the organism's signature. The spectra originate from resonant and near-resonant interactions of the illuminating DUV light with chemical bonds within and among nucleic and amino acids that constitute more than 50% (by dry weight) of the organism's mass. Hence, the spectra constitute a (partial) fingerprint of the organism. Because the light is in the DUV region, the bond interaction is near-resonant and the Raman scattering is enhanced, with signal-to-noise ratios of 10³-10⁴ being typical. In previous studies, spectra with signal-to-noise sufficient for analysis, from as few as 20 organisms in a clean environment measured in 15 seconds, have been demonstrated. Very importantly for biological measurements, interference from broad-band fluorescence in the DUV region of the spectrum where this method operates is virtually non-existent. The illuminating light need not damage the sample, allowing confirmation of positive readings through immediately repeated measurements by the same instrument. The sample can also be saved for forensic examination by other techniques at a later time. In addition, the spectra have been shown to contain information about the organism's stage of development and other information useful for assessing the threat posed by the organism.

This prior art technique has very limited ability to identify species. The ability to distinguish gram+ from gram− bacteria has been demonstrated, but more specific identification has not been possible. Even this crude level of identification has been demonstrated in pure samples only, not when the substance of interest is present along with other substances. Excitation at a single wavelength may excite not just the substance of interest but also some or all of the other substances present, spectrally masking its signature and making it difficult to interpret the emitted spectra and to identify the substance of interest.

Other approaches have applied spectral data processing algorithms to data resulting from a single illumination wavelength of pure samples but have demonstrated only a limited capability to distinguish between spectral “fingerprints”, that is, the ability to identify a signature of a particular species, organism, or substance. None has been successful in identifying an organism in the presence of other substances and/or organisms. Some require obtaining sets of training data and therefore do not lend themselves effectively to real time processing needs. Others exhibit limited ability to identify organisms due to the inherent limitations of their spectral data processing methodology.

Other current identification technologies, such as PCR, require a pre-enrichment step, i.e. a step in which the organism to be identified is grown for hours or days to provide the large number of organisms required for the identification method to be effective.

There is a need for a substance detector that can identify substances in the presence of other substances, rapidly and with high sensitivity and specificity, for example by identifying a minimal number of an organism in the presence of other organisms.

SUMMARY OF THE INVENTION

According to the invention, a spectroscopic detector for identifying the presence of a first substance in the presence of at least one other substance includes a laser for illuminating the first substance and the at least one other substance with electromagnetic radiation at a plurality of wavelengths to thereby induce the emission of radiation characteristic of the substance; a spectrometer (or photometer) for measuring the emitted radiation at a plurality of emission wavelengths to obtain a plurality of spectral measurement data; and a processor for processing the spectral measurement data. The processor includes a processing algorithm configured for combining the plurality of spectral measurement data into a composite spectrum; applying the algorithm to the composite spectrum whereby at least one parameter characteristic of the first substance is identified while information in the composite spectrum contributed by an emission of radiation from the at least one other substance is removed to thereby identify the presence of the first substance, and thereby obtain a characteristic spectral signature of the first substance; and comparing the spectral signature of the first substance to spectral signatures of substances in a spectral library database, wherein at least some of the library spectral signatures have spectral characteristics differentiated from each other by identifiable spectral characteristics caused by one or more environmental factors.

Also according to the invention, a method of identifying the presence of a first substance in the presence of at least one other substance includes illuminating the first substance and the at least one other substance with electromagnetic radiation of one or more wavelengths to thereby induce the emission of radiation characteristic of the substances being illuminated; measuring the emitted radiation to obtain a plurality of spectral measurement data; and inputting the spectral measurement data into the processor, the processor then combining the plurality of spectral measurement data into a composite spectrum, applying the algorithm to the composite spectrum to identify the parameter characteristic of the first substance and remove the information contributed by the one or more other substances that may be present to obtain its characteristic spectral signature, and comparing that signature to those in the Library, which, as discussed above, is differentiable based on spectral characteristics caused by one or more environmental factors.

In these embodiments, the invention overcomes the prior art limitations noted above by acquiring spectra, preferably resonant and near-resonant Raman Spectra, that are more complete and contain more information. It also overcomes prior art limitations noted above by utilizing a powerful code to analyze the information contained in these spectra. For example, in an application in which one hundred different illumination wavelengths are used, the acquired spectra contain as much as 100 times the information of the traditional single illumination wavelength Raman spectrum. This provides an increase of specificity and a greater resistance to interference from background clutter. The embodiment requires no pre-enrichment or only minimal pre-enrichment.

A powerful multispectral analysis code such as IHPS, CHOMPS, or ENN analyzes every acquired data point, examining details of the spectra that could not be handled by traditional methods. Here, multispectral is meant to indicate a number of wavelength dependent measurements greater than one. It is not meant to limit the number of such measurements in any way, although in practice that number will typically be much greater than 1. Important features of multispectral processors are their speed, their ability to distinguish between spectral “fingerprints” that cannot be reliably identified by conventional methods, the ability to identify a signature of a small amount of an organism in a high background clutter, and the ability to store microorganisms' spectral signatures in a built-in library.

The invention provides a new tool with which to study the protein or DNA/RNA markers of microorganisms and cells, as well as a tool to rapidly identify and count organisms that cause diseases. The invention also provides a tool to identify compounds, including chemicals in the presence of other chemicals. Specific applications include air monitoring, water monitoring, monitoring during food production, monitoring during production of pharmaceuticals, rapid detection of targeted disease organisms such as tuberculosis, pre and post sterilization monitoring, monitoring of allergens, identification of folded proteins (prions), and monitoring of blood constituents.

Additional features and advantages of the present invention will be set forth in, or be apparent from, the detailed description of preferred embodiments which follows.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic representation of a prior art resonance Raman UV detection system.

FIGS. 2 a and b are spectral graphs of biological organisms resonantly excited by illumination of a sample with DUV light.

FIG. 3 is a schematic diagram of a spectroscopic detector according to the invention.

FIG. 4 is a block diagram of an integrated DUV illuminator for diode-pumped lasers according to the invention.

FIG. 5 is a schematic diagram of a monochromator type of spectrometer according to the invention.

FIGS. 6 a-c are graphs showing the performance of the detector employing the CHOMPS processing algorithms in calculating the amount of a BC bacterium hidden among EC bacteria and random noise, for signal to noise ratios of 2, 37, and 450, respectively, according to the invention.

FIG. 7 is a graph showing an original measured spectrum and the spectrum plus a normally distributed noise for the BC-EC bacterial mixture of FIGS. 6 a-c.

FIG. 8 is a graph showing the performance of the detector as in FIGS. 6 a-c when only using one of the illuminating laser wavelengths according to the invention.

FIG. 9 is a graph showing the performance of the detector as in FIGS. 6 a-c when using two of the illuminating wavelengths according to the invention.

FIG. 10 is a graph showing the performance of the detector as in FIGS. 6 a-c when using four of the illuminating wavelengths according to the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Referring now to FIG. 3, a spectroscopic detector 100 includes an illumination source 102 for illuminating a sample 104 and thereby inducing the emission of radiation characteristic of the sample, a spectrum acquisition sensor or sensors 106 for sensing and capturing as spectral measurement data the response spectrum in the emitted radiation, and a processor 108 for processing and analyzing the spectral measurement data.

Source 102 includes a number of options, as follows:

Laser Illuminator (Type 1):

A single Gain Module, pumped by a pair of 40-W diode bars. Combined with the second-harmonic unit, this can generate about 10 W of average power at a 5-kHz rate. The Ti:sapphire laser pumped by this, tuned to 800 nm, produces an average power of 4 W.
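The average powers and pulse rate quoted above imply a per-pulse energy, since average power is pulse energy times repetition rate. The following is a minimal sketch of that arithmetic; the function name is hypothetical, and it assumes the Ti:sapphire laser pulses at the same 5-kHz rate as its pump.

```python
def pulse_energy_joules(average_power_w: float, pulse_rate_hz: float) -> float:
    """Energy per pulse = average power / pulse repetition rate."""
    if pulse_rate_hz <= 0:
        raise ValueError("pulse rate must be positive")
    return average_power_w / pulse_rate_hz


PULSE_RATE_HZ = 5_000.0  # the 5-kHz rate quoted in the text

# Pump output: about 10 W average power.
pump_energy_j = pulse_energy_joules(10.0, PULSE_RATE_HZ)
# Ti:sapphire output: 4 W average power (same pulse rate is an assumption).
ti_sapphire_energy_j = pulse_energy_joules(4.0, PULSE_RATE_HZ)

print(f"pump: {pump_energy_j * 1e3:.1f} mJ per pulse")
print(f"Ti:sapphire: {ti_sapphire_energy_j * 1e3:.1f} mJ per pulse")
```

Under these assumptions the pump delivers about 2 mJ per pulse and the Ti:sapphire laser about 0.8 mJ per pulse.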

The 700-960 nm VIS-NIR light from the Ti:sapphire laser is tripled or quadrupled to the DUV 233-320 nm using BBO (Beta Barium Borate) crystals. This nonlinear material, beta-barium borate (BaB₂O₄), has a large birefringence and UV transparency to 200 nm, allowing it to serve multiple roles as a doubler, tripler, or quadrupler of the fundamental Ti:sapphire wavelengths. For proper multiplication, as the Ti:sapphire wavelength is tuned, the angles for the BBO crystals must be adjusted accordingly. An Inrad (Northvale, N.J.) Autotracker III automatically adjusts the angle of the BBO nonlinear crystal to maximize conversion as the wavelength of the laser changes.
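The relation between the Ti:sapphire tuning range and the harmonic output can be checked with a short calculation: generating the n-th harmonic multiplies the optical frequency by n and therefore divides the wavelength by n. This is an illustrative sketch only; the function name is hypothetical.

```python
def harmonic_wavelength_nm(fundamental_nm: float, order: int) -> float:
    """Wavelength of the order-th harmonic of a fundamental wavelength."""
    if order < 1:
        raise ValueError("harmonic order must be >= 1")
    return fundamental_nm / order


tuning_range_nm = (700.0, 960.0)  # Ti:sapphire fundamental range from the text

for order, name in ((2, "second"), (3, "third"), (4, "fourth")):
    short_nm, long_nm = (harmonic_wavelength_nm(w, order) for w in tuning_range_nm)
    print(f"{name} harmonic: {short_nm:.1f}-{long_nm:.1f} nm")
```

Tripling the 700-960 nm fundamental gives 233.3-320 nm, which matches the DUV range stated above. Quadrupling the same range would give 175-240 nm.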

FIG. 4 presents a block diagram of an integrated DUV illuminator for diode-pumped lasers that includes a Nd:YLF laser pump, power supply and second-harmonic generation blocks. The Ti:sapphire laser tuning is accomplished by rotation of a three-plate birefringent filter driven by a computer-controlled stepper motor (not illustrated). The harmonic-generation process for the Ti:sapphire laser uses two BBO crystals mounted in Inrad Autotracker III units and generates UV average powers at the 10-100 mW level.

Laser Illuminator (Type 2)

Recent developments in optical-fiber technology make it possible to replace the diode-pumped Nd:YLF laser with a fiber laser. For example, the fiber-amplifier output can be in the form of pulses, with sub-ns duration, at a pulse rate of several kHz and per-pulse energy of hundreds of µJ, leading to an average power of more than 1 W. A fiber amplifier is highly efficient. With high peak power pulses one can double the fiber-laser output frequency with high efficiency using nonlinear crystals.

There are several advantages in using the fiber-laser design:

No active cooling is needed for the fiber medium.

The diode-pump absorption band for the fiber is broad, relaxing any temperature-control requirements for the diode pump lasers.

The microchip laser is monolithic, eliminating any cavity mirrors that could go out of alignment.

The overall pump laser hardware occupies a small volume, allowing construction of a compact system.

A Ti:sapphire laser employs a short resonator and produces sub-ns pulses. The threshold for the Ti:sapphire laser is 30 mJ, the slope efficiency is 45% at the operating wavelength of 780 nm and the line width is 170 pm. The short pulses from this laser are ideal for driving the harmonic-generation process. A further reduction in line width is required to fall within the spectral acceptance of the nonlinear crystals best suited for DUV generation. This is accomplished through the use of higher-finesse tuning elements.

Monochromator Illuminator

The invention may alternatively employ a monochromator for the illumination part of the system. A monochromator is a spectrometer whose output is a narrow wavelength band selected from a source containing a broader range of wavelengths or a number of distinct wavelengths. A monochromator can typically be scanned through large wavelength ranges by rotating a grating. FIG. 5 shows one common form of a monochromator 200 called the Czerny-Turner monochromator. Light from a broad-band source 202, such as a lamp, is focused by a lens 204 onto a variable-gap slit 206 and reflected by a concave mirror 208 onto a rotating grating 210. The grating 210 selects a narrow wavelength band from the many wavelengths that the source 202 produces. A second mirror 212 relays light at the selected wavelength band through a second slit 214 onto the biological sample 216. Different wavelength bands can be selected sequentially by rotating the grating.

Illuminator Made from Composite Single-wavelength Subsystems

Illumination at a few laser wavelengths can be provided by monochromatic sources, such as excimer lasers or solid state laser diodes being developed by DARPA. In this realization a plurality of such sources are placed on a stage which rotates the selected laser into the optical path, allowing the sample to be illuminated sequentially by laser light of different wavelengths. Alternatively, mirrors or a fiber optic switch may be used to select the illuminating source.

Illuminator Made from a Filtered Broad-band Source

A broad-band source, such as a lamp, may be passed through a filter to illuminate the sample with a range of wavelengths much smaller than that of the unfiltered source. The filter can be tuned, by changing its orientation, for example, or different filters can be rotated into place so as to provide for illuminating the sample at more than one wavelength band.

In one implementation, source 102 is a laser operating at a pulse rate of 5-10 kHz and an adjustable wavelength of 233 to 320 nm and an average power of 10-100 mW. The fundamental approach for building such a laser is to generate 700-960 nm VIS-NIR laser light with a tunable solid state laser, operating at 5-10 kHz, and to triple its output to the DUV 233-320 nm using third-harmonic-generation crystals. In other implementations, source 102 could be one or more lamps, or diodes, as discussed above, each operating at a different wavelength or filtered to output specific wavelengths. In still another embodiment, source 102 can be a wide spectrum lamp or diode whose output is sequentially adjusted using a monochromator to produce a time sequence of narrow band illumination.

Spectrum Detection

Emission spectrum detector 106 is selected for its sensitivity in the desired spectral range of the emitted radiation, and may be a spectrometer and detector such as the model HR2000, manufactured by Ocean Optics. In one embodiment spectrometers with rotatable gratings are used to enable synchronization of the wavelength range being measured with the wavelength of the illuminating system. Also, rotatable gratings allow adjustment in which the illuminating wavelength does not impinge on the detector, thereby allowing the recording medium to measure the spectrum from the sample being detected without interference from the illuminating wavelength light. In another embodiment, a scanning monochromator coupled to a single-point detector such as a photomultiplier tube is used. Other suitable detectors 106 include one or more diode or photoelectric tube detectors. Yet another detector 106 is a filter, e.g. a narrow pass-band, tunable, holographic filter that can be used with a constant wavelength source 102.

Light from the sample can be coupled onto the spectrometer or monochromator slit using fiber optics such as a DUV sapphire or quartz fiber. In another embodiment the light collection optics consist entirely or primarily of reflective elements, such as, for example, a Cassegrain microscope, with cylindrical or parabolic focusing optics. The light delivery and signal collection component may also be a scanning mirror. A fundamental requirement for this component is that the organisms remain within the illuminating beam for the duration of the measurement.

Analysis

Processor 108, e.g. a microprocessor or PC programmed with the spectral data processing algorithm discussed below, may also control the source 102 and make decisions, such as dwell time at each frequency, request for resampling, sample cell control, sample cell ejection etc. A signal indicating a positive reading, species detected, and its concentration may be communicated onto a local screen or other display device or to a remote location via radio. The complete spectral data set can be transmitted upon request for re-examination by a human or for archiving at a remote location.

Processor 108 is programmed with a spectral data processing algorithm that combines the plurality of spectral measurement data into a composite spectrum by stringing the spectral measurement data end-to-end. In this manner, the measurement becomes a single vector whose components represent the response of the sample to each illuminating wavelength. For all intents and purposes, once combined the composite spectrum is treated as a single measurement of the sample. The composite spectrum therefore includes the spectral measurement data resulting from the emitted radiation of not just the substance of interest but of the additional substances not of interest that may be present and that may tend to mask the signature of the substance of interest. No training data and no reference or baseline are required, because the composite spectrum is processed by a suitable classification algorithm, e.g. principal components analysis (PCA), least squares fitting, or, in a preferred embodiment, a multispectral processor such as the Intelligent Hypersensor Processing System (IHPS), described in U.S. Pat. No. 6,038,344, Palmadesso et al., issued Mar. 14, 2000, and incorporated herein by reference, or the Compression of Hyperdata with ORASIS Multisegment Pattern Sets (CHOMPS), described in U.S. Pat. No. 6,167,156, Antoniades et al., issued Dec. 26, 2000, and incorporated herein by reference, or the “Efficient Near Neighbor Search (ENN-Search) Method For High Dimensional Data Sets With Noise”, described in U.S. patent application Ser. No. 10/113,643, filed Mar. 29, 2002, and incorporated herein by reference.
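The end-to-end stringing of spectra described above can be sketched as a simple concatenation. This is a minimal illustration, not the code of the cited processors: the dictionary layout (emission spectra keyed by excitation wavelength in nm), the function name, and all numeric values are assumptions made for the example.

```python
def composite_spectrum(spectra_by_excitation):
    """Concatenate emission spectra end-to-end, ordered by excitation wavelength."""
    composite = []
    for excitation_nm in sorted(spectra_by_excitation):
        composite.extend(spectra_by_excitation[excitation_nm])
    return composite


# Hypothetical measurements: four emission channels per excitation wavelength.
measured = {
    266.0: [0.1, 0.9, 0.3, 0.0],
    233.0: [0.2, 0.4, 0.8, 0.1],
    300.0: [0.0, 0.5, 0.5, 0.2],
}

vector = composite_spectrum(measured)
print(len(vector))  # 3 excitation wavelengths x 4 channels = 12 components
```

Sorting by excitation wavelength fixes the order of the segments, so that composite spectra from different measurements and from the Library line up component by component.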

IHPS processes the outputs of multiple sensors, which in the case of the present invention is the measured spectral data, and forms a series of pattern vectors through the concatenation of the measured spectral data. The data is simultaneously sent to two separate processor “pipes”, although in some applications the Adaptive Learning Module would be run first on a series of measured spectra and then the Demixer Module would be run. The first is the Demixer Module, which decomposes each pattern vector into a convex combination of a set of fundamental patterns which are the constituents of the “mixture”. The decomposition is accomplished using projection operations termed “filter vectors” generated by the second processor pipe, termed the Adaptive Learning Module. In this manner, the signature pattern, containing and/or representative of one constituent or a substance of interest, and which is masked by the presence of other substances' emitted radiation, is separated from the latter. A Library database of signatures of known substances is preferably employed in order that the signatures of the other substances are determined automatically by processor 108. Information detailing the identification of a substance of interest, and optionally information concerning other substances as well, is output to a display device, e.g. a portable video monitor or LCD screen, alarm, and/or a communication link as may be convenient for the user and the intended application.

CHOMPS is a collection of algorithms designed to optimize the efficiency of multispectral data processing systems. CHOMPS employs two types of algorithms, focused searching algorithms and compression packaging algorithms. The focused algorithms reduce the computational burden of the prescreening process by reducing the number of comparisons needed to decide whether data is redundant, by selecting only those exemplars likely to result in the exclusion of measured spectral data for the prescreener comparisons. The compression packaging algorithms employed by CHOMPS compress the volume of the data necessary to describe the substance of interest and the other substances present. These in combination with IHPS (or other suitable multispectral processors) lead to a compressed exemplar data set, although it should be understood that the compression is optional, that is, it may not be necessary for a selected application and is selectable based on the particular detector design desired and/or its intended application or end use.
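The redundancy test that the focused searching algorithms accelerate can be pictured with a plain prescreener loop. The sketch below is an assumption-laden illustration rather than the CHOMPS code: it uses the spectral angle as the similarity measure, a fixed angle threshold, and a simple linear scan of the exemplars. It also counts comparisons, which is the quantity the focused search is meant to reduce.

```python
import math


def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("spectra must be non-zero")
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def prescreen(spectra, max_angle_rad):
    """Keep a spectrum as a new exemplar only if no existing exemplar is close to it."""
    exemplars = []
    comparisons = 0
    for spectrum in spectra:
        redundant = False
        for exemplar in exemplars:
            comparisons += 1
            if spectral_angle(spectrum, exemplar) <= max_angle_rad:
                redundant = True  # already represented, so it can be excluded
                break
        if not redundant:
            exemplars.append(spectrum)
    return exemplars, comparisons


# Hypothetical three-channel spectra: two distinct shapes plus near-duplicates.
measured = [
    [1.0, 0.0, 0.0],
    [0.99, 0.01, 0.0],
    [0.0, 1.0, 0.0],
    [0.01, 0.99, 0.0],
    [1.0, 0.0, 0.005],
]
exemplars, comparisons = prescreen(measured, max_angle_rad=0.05)
print(len(exemplars), comparisons)
```

In this toy data set only two exemplars survive. Choosing which exemplar to test first, so that a redundant spectrum is excluded after as few comparisons as possible, is the part of the method this sketch leaves out.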

The ENN search algorithm is based upon the CHOMPS algorithm, but it does not stop searching when a suitable match is found; instead it continues looking for a better match. Accordingly, it extends CHOMPS when searching the Library to find an even better match for the spectral signature or signatures of interest. Additionally, the technical method of the search is different from that used in CHOMPS.
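The difference between stopping at the first suitable match and continuing to a better one can be shown on a toy library. This sketch contrasts only the two stopping rules. It does not reproduce the search technique of either CHOMPS or ENN-Search, and the spectral-angle metric, the threshold, and the library entries are all hypothetical.

```python
import math


def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("spectra must be non-zero")
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def first_suitable_match(query, library, max_angle_rad):
    """Return the first library entry within the threshold, then stop."""
    for name, signature in library.items():
        angle = spectral_angle(query, signature)
        if angle <= max_angle_rad:
            return name, angle
    return None


def best_match(query, library):
    """Examine every library entry and return the closest one."""
    scored = ((name, spectral_angle(query, sig)) for name, sig in library.items())
    return min(scored, key=lambda item: item[1])


library = {
    "signature_A": [1.0, 2.0, 2.5],   # acceptable match
    "signature_B": [1.0, 2.0, 3.05],  # closer match, listed later
}
query = [1.0, 2.0, 3.0]

first = first_suitable_match(query, library, max_angle_rad=0.1)
best = best_match(query, library)
print(first[0], best[0])
```

Here the first-match rule settles on `signature_A`, while the exhaustive rule finds that `signature_B` lies closer to the query.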

The IHPS, CHOMPS, and ENN algorithms have methods for organizing data and allowing fast comparisons that may be used as preprocessors or for managing large libraries. All, or perhaps only some, of the base algorithms are necessary. IHPS/CHOMPS/ENN contain a well developed set of algorithms designed for fast processing. They provide indications of the presence and estimates of the concentration of the sample constituents. The limits of the algorithms are determined by the physics of the response of the system.

In applying a multispectral algorithm such as IHPS, CHOMPS, or ENN, there are two aspects to the processing of the data that is proposed to be collected. The first is identification of the constituents of a measured mixture. The second is the determination of the concentration of the specific constituent. The power to make these determinations comes from the large number of measurements that are made. Each measurement of an emitted wavelength (for each illumination wavelength) gives a clue as to the makeup of the mixture. As the number of measurements increases, with increasing number of illuminating wavelengths, the power to make determinations of material identification and concentration also increases. Additionally, the effects of noise decrease with the increasing number of measurements.
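The statement that noise effects fall as measurements are added can be illustrated for the simplest case. For a single constituent with signature s, measured as x = c·s + noise with independent noise of standard deviation σ in every channel, the least-squares concentration estimate s·x / s·s has standard deviation σ / sqrt(s·s). The sketch below assumes, purely for illustration, that every illumination wavelength contributes a response of the same strength. The signature values and function name are hypothetical.

```python
import math


def abundance_std(signature, noise_sigma):
    """Standard deviation of the least-squares concentration estimate."""
    energy = sum(s * s for s in signature)
    if energy == 0.0:
        raise ValueError("signature must be non-zero")
    return noise_sigma / math.sqrt(energy)


single_response = [0.0, 1.0, 4.0, 1.0, 0.0]  # hypothetical emission spectrum
noise_sigma = 0.5

for n_wavelengths in (1, 2, 4, 100):
    # Composite signature: one response per illumination wavelength, end-to-end.
    composite = single_response * n_wavelengths
    print(n_wavelengths, round(abundance_std(composite, noise_sigma), 4))
```

Under these assumptions the uncertainty falls as one over the square root of the number of illumination wavelengths, so four wavelengths halve it relative to one.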

The measured spectra of the mixtures exist in a large dimensional space where the dimensionality can be as high as the number of measurements. However, the effective dimension (when noise is small or can be ignored) can be much lower, roughly the number of distinguishable constituents. By making many measurements the likelihood of finding a dimension that can be used to distinguish similar materials increases. The ability to increase the “contrast” between similar items depends ultimately on the physical nature of the materials: there must be a sufficient difference somewhere and it must be measurable.

CHOMPS is designed to analyze hyperspectral imagery. However, the concepts readily transfer over to almost any spectroscopic technique. The general idea is that the mixtures measured are a linear convex combination of constituent materials. This allows the processing to be based on convex geometry. The linear model is sufficient in most cases even when the data exhibits nonlinear effects such as the exponential decay associated with varying water depths.

CHOMPS' algorithms are generally run sequentially. The algorithms process data from hyperspectral images and Raman spectroscopy images or point measurements. The spectra are considered as vectors of length n, where n is the number of wavelength channels of emitted light measured. Composite spectra are created by combining multiple spectra each associated with a different wavelength laser illumination. Of particular importance are the filter vector algorithm and the prescreener algorithm. The algorithms are very fast and the time to process the data is not a limiting factor in the practice of the invention.

The filter vector algorithm is able to determine the concentrations of a particular, known substance embedded in a complex background. Calculation of the filter vectors is very fast, requiring only a simple matrix inversion. The spectrum of a material is often referred to as an “endmember.” The filters are “matched” to the endmember being sought within the current background. The filter vector approach is based on a linear least squares approach to determining the concentrations of the constituent endmembers, and in the linear least squares sense it is an optimal solution.
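A linear least-squares filter of the kind described above can be written out for the smallest interesting case, one target endmember and one background endmember. With the endmembers as the columns of a matrix E, the rows of (EᵀE)⁻¹Eᵀ act as filter vectors: the dot product of a row with a measured spectrum returns the least-squares concentration of the corresponding endmember. The sketch below is an illustration under stated assumptions, not the patented implementation: it is restricted to two endmembers so that the matrix inversion is a closed-form 2×2 inverse, it does not enforce the convexity constraint, and all spectra are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def filter_vectors(e1, e2):
    """Rows of (E^T E)^-1 E^T for the two-column endmember matrix E = [e1 e2]."""
    a, b, d = dot(e1, e1), dot(e1, e2), dot(e2, e2)
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("endmembers are linearly dependent")
    f1 = [(d * x - b * y) / det for x, y in zip(e1, e2)]
    f2 = [(-b * x + a * y) / det for x, y in zip(e1, e2)]
    return f1, f2


# Hypothetical composite spectra for the substance sought and the background.
target = [0.0, 1.0, 4.0, 1.0, 0.0, 0.0]
background = [2.0, 2.0, 1.0, 3.0, 2.0, 1.0]

f_target, f_background = filter_vectors(target, background)

# A noise-free mixture: 10% target hidden in 90% background.
mixed = [0.1 * t + 0.9 * b for t, b in zip(target, background)]
print(round(dot(f_target, mixed), 6), round(dot(f_background, mixed), 6))
```

Each filter responds with 1 to its own endmember and 0 to the other, which is why the target concentration is recovered even though the background dominates the measurement.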

The prescreener algorithm can organize a very large spectral library in an optimal manner. Once this is done a measured spectrum, or an endmember, can be quickly matched to a library spectrum.

CHOMPS/IHPS/ENN can process data much faster than the instrument can measure them. As the number of measured spectra grows, a good representation of the background space spanned by the spectra is determined, and the algorithm can differentiate the substance of interest, e.g. a dangerous pathogen, more easily. The algorithm can work entirely from library spectra, and thus do some level of identification on a single measured spectrum, but that is likely dependent on having information about the expected background material.

The substance of interest should have a significant effect on the measured spectrum. A material that is sub-pixel covers less than the entire physical extent of the measurement area, with other substances not of interest accounting for the rest of the spectral information. A convolution of the spectral difference of the substance of interest compared to the background and the coverage area of the substance within the measurement area determines the ability to determine the presence of the substance. This is akin to being able to see a bright orange cone against a black background compared to seeing something the same size that is dark grey.

In order to do identification as opposed to anomaly detection, the algorithm being used should be effective against subpixel targets. Certain algorithms used for hyperspectral analysis are of limited value. These include most statistical approaches used for classifying spectra. Such approaches depend on having some “ground truth” associated with the data. The spectra are then classified based on their statistical similarity to the ground truth. Methods such as Principal Component Analysis (PCA) are widely used but are not substantially useful here, as the analysis does not produce physically meaningful (constituent's) spectra, but instead produces directions in the data that maximize the variance.

The class of algorithms to which CHOMPS, IHPS, and ENN belong provides improved spectral data processing capabilities for the applications described here. This class of algorithms follows the Linear Mixture Model (LMM); these handle subpixel mixing, produce physically meaningful endmember spectra, and can produce estimations of the concentration of each endmember. Pixel Purity, available with the commercial package called ENVI, manufactured by Research Systems Inc., is one such algorithm. Pixel Purity works by projecting the data against random vectors. The purest spectrum of each material is more likely than mixtures to be on the extreme ends of the projected data. By repeatedly projecting the data and counting the number of times it is extreme, the most pure spectra may be obtained. However, the number of projections needed goes at least linearly as the amount of material in the pixel increases. To find extremely subpixel material requires more projections than is desirable. Even when the most pure endmembers (albeit not purest) are found from within the data, the algorithm does not give estimations of the pure endmembers as CHOMPS does. Additionally, it does not have any library handling ability as CHOMPS does.
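The random-projection idea described above can be illustrated with a minimal sketch. It is not the ENVI implementation; the function name `pixel_purity_counts`, the Gaussian choice of random directions, and the synthetic three-channel spectra are all assumptions made for illustration only.

```python
import random

def pixel_purity_counts(spectra, n_projections=500, seed=0):
    """Count how often each spectrum lands on an extreme end of a random projection."""
    rng = random.Random(seed)
    n_channels = len(spectra[0])
    counts = [0] * len(spectra)
    for _ in range(n_projections):
        # Random direction in channel space (assumed: independent Gaussian components).
        direction = [rng.gauss(0.0, 1.0) for _ in range(n_channels)]
        # Project every spectrum onto the random direction.
        scores = [sum(d * s for d, s in zip(direction, spec)) for spec in spectra]
        # Credit the spectra at the low and high ends of the projected data.
        counts[scores.index(min(scores))] += 1
        counts[scores.index(max(scores))] += 1
    return counts

# Synthetic, illustrative data: two pure spectra and three linear mixtures of them.
pure_a = [1.0, 0.0, 0.2]
pure_b = [0.0, 1.0, 0.7]

def mix(fraction_a):
    """Linear mixture with the given fraction of pure_a."""
    return [fraction_a * a + (1.0 - fraction_a) * b for a, b in zip(pure_a, pure_b)]

spectra = [pure_a, pure_b, mix(0.25), mix(0.5), mix(0.75)]
counts = pixel_purity_counts(spectra)
```

In this toy case the mixtures lie on the line between the two pure spectra, so only the pure spectra are ever extreme and they collect all of the counts.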

NFINDR, an algorithm manufactured by TRA, Inc., which is comparable to CHOMPS in performance and has about the same processing speed, works by finding the largest-volume simplex possible from the data. It, however, also experiences difficulties with extremely subpixel spectra. Also, it will not make an estimation of the endmember, but rather just uses the spectrum contained within the data. And like Pixel Purity, NFINDR does not have any library handling ability.

CHOMPS acquires the resulting spectra emitted by the bio agent or substance of interest from a sequential or stepped scan through many different excitation laser pulses as, perhaps, a 4-dimensional hyperspectral cube whose labels are:

(1) Excitation wavelength

(2) Emitted wavelength

(3) Spatial dimension

(4) Time dimension

Into each of these labeled locations the emitted amplitude is inserted. All these measurements would compose a data set. This organization of the data is one possible way that might be used. Many others also exist. The signature spectra may then be compared as noted above to a similarly configured Library of signature spectra to identify the presence of a substance of interest. The Library in a preferred embodiment includes signature spectra distinguishing not just individual microorganisms but also different species and subspecies, as well as individual characteristics such as the age of a microorganism, its growth medium, or other such environmental factors. For example, different spectral signatures in the Library can include subspecies of E. coli, E. coli grown in different media, and E. coli of different ages. Essentially, the Library should include measurements of any and all organisms that are known and differentiable by the measurements. Any characteristics of the organisms that would result in a differentiable measured composite spectrum should be included. It is also conceivable for the system to identify spectra as spectra that are not in the Library. This is done by noting that endmembers have no close match in the Library. These spectra could be later analyzed to determine if they indeed represent new organisms and then added to the Library for future use.
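The Library comparison, including the case where an endmember has no close match, can be sketched as follows. The spectral-angle metric, the 0.1 radian threshold, the function names, and the four-channel signatures are assumptions chosen for illustration; the specification does not state which similarity measure or threshold is used.

```python
import math

def spectral_angle(u, v):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def match_library(spectrum, library, max_angle=0.1):
    """Return (name, angle) of the closest entry, or (None, angle) if none is close."""
    best_name, best_angle = None, float("inf")
    for name, reference in library.items():
        angle = spectral_angle(spectrum, reference)
        if angle < best_angle:
            best_name, best_angle = name, angle
    if best_angle > max_angle:
        # No close match: flag as a candidate new entry for later analysis.
        return None, best_angle
    return best_name, best_angle

# Hypothetical Library entries; the values are invented, not measured signatures.
library = {
    "E. coli, medium 1": [0.9, 0.1, 0.4, 0.2],
    "E. coli, medium 2": [0.8, 0.3, 0.4, 0.1],
    "B. cereus": [0.1, 0.9, 0.2, 0.6],
}

known = match_library([0.88, 0.12, 0.41, 0.2], library)
unknown = match_library([0.0, 0.1, 1.0, 0.0], library)
```

Here `known` resolves to the first entry, showing that two signatures of the same organism grown in different media can be told apart, while `unknown` returns no name and could be queued for later analysis.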

Spectra produced by the invention may also be used with a multispectral processor 108 such as IHPS in a non-scanning or limited wavelength mode, in that the illumination source 102 can be set to excite a sample at a smaller number of selected wavelengths, and due to the processing advantage of IHPS, IHPS with CHOMPS, ENN, or another suitable multispectral processor with or without CHOMPS, effectively identify a substance of interest in the presence of other substances better than prior art devices.

To test the demixing ability of CHOMPS, test composite spectra were constructed from individual spectra published in the literature. The data are of BC and EC (Bacillus cereus, E. coli) bacteria irradiated with laser light at 222, 231, 242, and 251 nm at very different times. The data were re-sampled to a 5 cm⁻¹ spacing and, to improve the resolving ability of CHOMPS, the spectra taken at the 4 incident illumination wavelengths were joined together to form a composite spectrum (the length of which can be varied, e.g. up to 625 channels). Thus, the Raman-spectral signatures of BC and EC bacteria illuminated at 222, 231, 242, and 251 nm were strung together to create a “four wavelength fingerprint” of each type. The four-wavelength spectrum of one organism was then added to the four-wavelength spectrum of the other (the clutter) in different ratios to simulate a small amount of organism present in a larger background clutter. The limit of detection is influenced by the extent to which the agent signal is masked by signals from other organisms (clutter) and by random noise. The detector also performs in the presence of clutter and random noise. Different amounts of random noise were added to determine an achievable level of noise immunity. CHOMPS was then applied to see whether, and to what extent, it can recover the correct amount of the test organism. The Signal to Noise (S/N) level is defined as the average of (signal/noise) in each channel. Mixtures were made in sliding amounts, from a pure endmember to one with 3×10⁻³. With the endmembers known, the filter vectors were calculated and applied to the mixtures. The results are shown in FIGS. 6 a-c. On the y-axis is plotted 1 minus the difference between the expected answer and that found by applying the filter vectors. On the x-axis is the expected amount. If the retrieval worked perfectly, the expected and retrieved amounts would be the same and the y-axis value would be 1.0 for all values of the x-axis. However, note that as the amount of the material becomes small, the retrieved amount differs from the expected amount by an increasingly larger amount. The average S/N for the three graphs is 2, 37, and 450, as labeled for FIGS. 6 a-c, respectively. FIG. 7 shows the noise level for the mixture and the mixture with noise added, for an S/N of 37, that is, for an original measured spectrum and the spectrum plus a normally distributed noise. FIG. 8 shows the results found when only using one of the illuminating laser wavelengths. FIG. 9 shows the result found when using two of the illuminating wavelengths. FIG. 10 shows the result found using all four of the illuminating wavelengths.
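The structure of this test (joining the four spectra, mixing with clutter, adding noise, and retrieving the amount from known endmembers) can be sketched as below. This is an illustration only: the spectra are random synthetic stand-ins rather than the published BC and EC data, and an ordinary least-squares solve is used as an assumed substitute for the CHOMPS filter vectors, whose construction is not reproduced here. All function names are hypothetical.

```python
import random

def composite(spectra_by_wavelength):
    """Join spectra taken at several illumination wavelengths end to end."""
    joined = []
    for spectrum in spectra_by_wavelength:
        joined.extend(spectrum)
    return joined

def retrieve_fractions(mixture, end_a, end_b):
    """Least-squares amounts of two known endmembers (2x2 normal equations)."""
    aa = sum(x * x for x in end_a)
    bb = sum(y * y for y in end_b)
    ab = sum(x * y for x, y in zip(end_a, end_b))
    am = sum(x * m for x, m in zip(end_a, mixture))
    bm = sum(y * m for y, m in zip(end_b, mixture))
    det = aa * bb - ab * ab
    return (am * bb - bm * ab) / det, (bm * aa - am * ab) / det

rng = random.Random(1)
# Synthetic stand-ins for spectra at 222, 231, 242 and 251 nm (50 channels each).
bc = composite([[rng.random() for _ in range(50)] for _ in range(4)])
ec = composite([[rng.random() for _ in range(50)] for _ in range(4)])

fraction = 1e-3  # one part agent in roughly a thousand parts clutter
clean = [fraction * b + (1.0 - fraction) * e for b, e in zip(bc, ec)]
# Normally distributed noise; the standard deviation is an arbitrary choice.
noisy = [m + rng.gauss(0.0, 1e-4) for m in clean]

f_clean, _ = retrieve_fractions(clean, bc, ec)
f_noisy, _ = retrieve_fractions(noisy, bc, ec)
```

Without noise the retrieved fraction reproduces the expected one to rounding error; with noise added it departs from the expected value by an amount that grows in relative terms as the fraction shrinks, which is the behavior described for FIGS. 6 a-c.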

The results demonstrate that the amount of BC organism hidden among EC organisms in the ratio of 1/1000, equivalent to 100 agents among 100,000 other organisms, was correctly determined (with an error of 15%) using 4 illumination wavelengths with a signal-to-noise ratio of 450. This is considered to be a very good result.

Good signal-to-noise ratios are important for lowering detection thresholds. For 100 illumination wavelengths, one can identify an agent hidden in clutter in a ratio much smaller than 1/1000. This is because, with 100 illumination wavelengths, the stated results improve by at least the square root of 100/4, or a factor of 5. All else being equal, one can expect to identify an organism hidden in clutter in the ratio of 1/1000 for signal-to-noise ratios of 450 and 37 (accuracy of 3% and 30%, respectively). The signal-to-noise ratio typical of resonant Raman is 10³-10⁴, which is about 2 to 22 times better than that used in the calculation. With illumination at 100 wavelengths, not only does the influence of random noise decrease, but also the “contrast” or distinguishability between the organisms' patterns sharply increases. Other known mathematical techniques to reduce the effects of random noise may also be employed.
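The arithmetic behind these figures can be checked with a short sketch. It assumes, as the paragraph states for the minimum improvement, that the error scales inversely with the square root of the number of illumination wavelengths; the function name is hypothetical.

```python
import math

def noise_improvement(n_new, n_old):
    """Minimum improvement factor when going from n_old to n_new wavelengths."""
    return math.sqrt(n_new / n_old)

factor = noise_improvement(100, 4)    # square root of 100/4
error_at_100 = 15.0 / factor          # percent error, starting from 15% at S/N 450
snr_margin = (1e3 / 450, 1e4 / 450)   # typical resonant Raman S/N relative to 450
```

Under that assumption the 15% error found with four wavelengths at a signal-to-noise ratio of 450 falls to 3%, and the typical resonant Raman signal-to-noise range is roughly 2.2 to 22 times the value used in the calculation.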

The detector 100 is well suited for continuous autonomous operation with occasional down-time for maintenance only. The performance of the system may be monitored remotely and continuously by, for example, monitoring for the presence and density of harmless species commonly found in the ambient background. Illuminator performance may be monitored with standard power meters and the results transmitted to the control computer. Standard electric power is the only consumable required.

In the scanned wavelength implementation, the detector 100 can physically scan through multiple wavelengths with a selected dwell time, e.g. of ˜1 second, at each illumination wavelength. The processing/analysis, such as with CHOMPS, is fast, on the order of a few milliseconds. The apparatus can, therefore, complete an entire scan and analyze the spectral results in just minutes, or less, depending on the number of wavelengths measured and the particular implementation. Also, as discussed above, the apparatus can optionally be used to scan at just a few or at one select frequency, if desired, and it should be understood that the number of scanned frequencies may be as high or as low as the designer and/or experimenter may choose depending on the particular application. The resolution of the spectral measurements may also be adjusted by balancing the conflicting requirements posed by the number of photons the sample emits and the similarity of the spectra from the sample to be identified to spectra of other materials present. Higher resolution measurements require that the sample emit more photons. Lower resolution spectra require that the sample spectrum differ from spectra of other materials at the measured resolution. To identify organisms whose spectra are presented in figures in this application we expect to need a resolution of 1 to 50 wavenumbers (cm⁻¹). In other applications, or for other organisms, identification may possibly be made with much poorer spectral resolution, including resolution so poor that the measurement can be made with filtered photodetectors instead of with a spectrometer. In some applications the sample material to be identified may be chemically bound to another substance to produce a new material whose spectral signature is unique and/or very different from the signatures of other substances that may be present. In that case a measurement with poor spectral resolution may be possible. Applications in which substances have similar spectra may require better spectral resolution to detect subtle differences in the spectra.

The detector analyzes simultaneously for all the agents in its library. The number of organisms that the detector can distinguish is equal to the number of distinct hyperspectral fingerprints that are produced by illumination at multiple (e.g. 100) wavelengths. The four-wavelength spectra of EC and BC bacteria discussed above, for example, were found to differ from each other by 80 times the minimum that the CHOMPS algorithms require for identification. The difference increases further when 100 wavelengths instead of four are used. In fact, if agents differed from each other in spectra produced at one illumination wavelength only, and were otherwise identical, the detector could distinguish 100 organisms. The number of false positives can be significantly reduced by repeating a measurement, since the detecting method need not damage the organism in the sample.

As discussed above, the invention is able to identify a particular species within a genus of a microbiological organism, e.g. to distinguish a pathogenic E. coli O157 from other harmless E. coli spectral fingerprints. Features of agent spectral fingerprints will vary based on the organism's stage of development and growth history. However, there is good evidence from traditional resonance Raman studies that spectral signatures from certain unique proteins/markers remain unaffected by the organism's growth stage or growth history. The detector can isolate this information and use it to provide important epidemiological and forensic information. Additionally, the processing algorithm can be programmed to ignore particular individual measurements within the composite spectrum. This may be done to facilitate the determination of particular characteristics of the sample.

Although the examples provided herein are directed to resonant or near-resonant Raman data performed on bacteria, the apparatus can also identify proteins, viruses, cells, and organic and non-organic chemicals. Accordingly, the detector 100 can identify Category A, B, and C microorganisms or agents as defined by the Centers for Disease Control. Category A includes anthrax (Bacillus anthracis), botulism (Clostridium botulinum toxin), plague (Yersinia pestis), smallpox (variola major), tularemia (Francisella tularensis), and viral hemorrhagic fevers (filoviruses [e.g., Ebola, Marburg] and arenaviruses [e.g., Lassa, Machupo]). Category B includes brucellosis (Brucella species), epsilon toxin of Clostridium perfringens, food safety threats (e.g., Salmonella species, Escherichia coli O157:H7, Shigella), glanders (Burkholderia mallei), melioidosis (Burkholderia pseudomallei), psittacosis (Chlamydia psittaci), Q fever (Coxiella burnetii), ricin toxin from Ricinus communis (castor beans), staphylococcal enterotoxin B, typhus fever (Rickettsia prowazekii), viral encephalitis (alphaviruses [e.g., Venezuelan equine encephalitis, eastern equine encephalitis, western equine encephalitis]), and water safety threats (e.g., Vibrio cholerae, Cryptosporidium parvum). Category C includes emerging infectious diseases such as Nipah virus and hantavirus. Other possible sources include substances of human origin such as blood, feces, and spit in situations such as border control, hospital triage, and epidemics and group or mass infections of the civilian and/or military sectors. These can be identified in the environment at fixed sites, e.g. as a fixed installation of a detector 100 to monitor the air, water, containers, mail, AC/heating ducts, arenas, airport screening/security, and various surfaces, as well as in mobile or transient applications, e.g. as a human-portable or hand-held implementation of detector 100 or one mounted in a land vehicle or in an air or sea platform/vehicle. Detector 100 may further be used to verify the effectiveness of monitoring of decontamination procedures and processes.

The mathematical upper limit of distinguishability is that the composite spectra differ from each other in at least one measurement. Each composite spectrum consists of about 1000 measurements per illumination wavelength × 100 illumination wavelengths = 10⁵ measurements. Thus, mathematically, in this example more than 10⁵ agents can be distinguished, so that the mathematics of the detector are not the limiting factor for distinguishability.

The preferred detector embodiment uses no consumables and requires no pre-enrichment. The detector output lends itself naturally to wireless transmission. A wireless communication package is straightforward to attach to the detector. Analyzed detector output can be presented in terms of the type and numbers of organisms detected, combined with an alarm if any of these are harmful. Raw data can also be transmitted to a remote site for further analysis by specialists or for archiving. Triggers, external commands, diagnostic signals, etc. are likewise straightforward to implement.

The apparatus can be a modular system, with the illuminator, analysis and communications packages, and the sampling unit separated by many meters. All the modules combined could occupy a volume of <3 cubic feet. No special logistical or environmental requirements exist.

Obviously many modifications and variations of the present invention are possible in the light of the above teachings. It is therefore to be understood that the scope of the invention should be determined by referring to the following appended claims.

1. A method of identifying the presence of a first substance in the presence of at least one other substance, comprising: illuminating said first substance and said at least one other substance with electromagnetic radiation of one or more wavelengths to thereby induce the emission of radiation characteristic of the substances being illuminated; measuring the emitted radiation to obtain a plurality of spectral measurement data; and inputting said spectral measurement data into a processor, said processor including a processing algorithm configured for: combining the plurality of spectral measurement data into a composite spectrum; applying said algorithm to said composite spectrum whereby at least one parameter characteristic of said first substance is identified while information in said composite spectrum contributed by an emission of radiation from said at least one other substance is removed to thereby identify the presence of said first substance, and thereby obtaining a characteristic spectral signature of said first substance; and comparing the spectral signature of said first substance to spectral signatures of substances in a spectral library database, wherein at least some of said library spectral signatures have spectral characteristics differentiated from each other by identifiable spectral characteristics caused by one or more environmental factors.
 2. A method as in claim 1, wherein the one or more environmental factors include a growth medium.
 3. A method as in claim 1, wherein the one or more environmental factors include a stage of development of a substance of interest.
 4. A method as in claim 1, wherein the one or more environmental factors include a stage of development of a substance not of interest.
 5. A method as in claim 1, wherein the plurality of spectral measurement data comprises multiple measurements of emitted radiation at a resolution lower than 0.1 cm⁻¹.
 6. A method as in claim 1, wherein the processing algorithm includes a multispectral data processing algorithm for analyzing multispectral data.
 7. A method as in claim 1, wherein the illuminating is with radiation in the ultraviolet wavelength range.
 8. A method of identifying the presence of a first substance in the presence of at least one other substance, comprising: illuminating said first substance and said at least one other substance with electromagnetic radiation of one or more different wavelengths to thereby induce the emission of radiation characteristic of the substance being illuminated; measuring the emitted radiation to obtain a plurality of spectral measurement data; inputting said spectral measurement data into a processor, said processor including a multispectral data processing algorithm; applying said algorithm to said plurality of spectral measurement data whereby at least one parameter characteristic of said first substance is identified while information contributed by an emission of radiation from said at least one other substance is removed to thereby identify the presence of said first substance, and thereby obtaining a characteristic spectral signature of said first substance; and comparing the spectral signature of said first substance to spectral signatures of substances in a spectral library database, wherein at least some of said library spectral signatures have spectral characteristics differentiated from each other by identifiable spectral characteristics caused by one or more environmental factors.
 9. A method as in claim 8, wherein the one or more environmental factors include a growth medium.
 10. A method as in claim 8, wherein the one or more environmental factors include a stage of development of a substance of interest.
 11. A method as in claim 8, wherein the one or more environmental factors include a stage of development of a substance not of interest.
 12. An apparatus for identifying the presence of a first substance in the presence of at least one other substance, comprising: means for illuminating said first substance and said at least one other substance with electromagnetic radiation at a plurality of wavelengths to thereby induce the emission of radiation characteristic of the substance; means for measuring the emitted radiation at a plurality of emission wavelengths to obtain a plurality of spectral measurement data; and a processor for processing said spectral measurement data, said processor including a multispectral data processing algorithm for (i) identifying at least one parameter characteristic of said first substance while removing information in said spectral data contributed by an emission of radiation from said at least one other substance to thereby identify the presence of said first substance, and thereby obtaining a characteristic spectral signature of said first substance, and (ii) comparing the spectral signature of said first substance to spectral signatures of substances in a spectral library database, wherein at least some of said library spectral signatures have spectral characteristics differentiated from each other by identifiable spectral characteristics caused by one or more environmental factors.
 13. An apparatus as in claim 12, wherein the one or more environmental factors include a growth medium.
 14. An apparatus as in claim 12, wherein the one or more environmental factors include a stage of development of a substance of interest.
 15. An apparatus as in claim 12, wherein the one or more environmental factors include a stage of development of a substance not of interest.
 16. An apparatus as in claim 12, wherein the illuminating is with radiation in the ultraviolet wavelength range.
 17. An apparatus for identifying the presence of a first substance in the presence of at least one other substance, comprising: a laser for illuminating said first substance and said at least one other substance with electromagnetic radiation at a plurality of wavelengths to thereby induce the emission of radiation characteristic of the substance; means for measuring the emitted radiation at a plurality of emission wavelengths to obtain a plurality of spectral measurement data; and a processor for processing said spectral measurement data, said processor including a processing algorithm configured for: combining the plurality of spectral measurement data into a composite spectrum; applying said algorithm to said composite spectrum whereby at least one parameter characteristic of said first substance is identified while information in said composite spectrum contributed by an emission of radiation from said at least one other substance is removed to thereby identify the presence of said first substance, and thereby obtain a characteristic spectral signature of said first substance; and comparing the spectral signature of said first substance to spectral signatures of substances in a spectral library database, wherein at least some of said library spectral signatures have spectral characteristics differentiated from each other by identifiable spectral characteristics caused by one or more environmental factors.
 18. An apparatus as in claim 17, wherein the one or more environmental factors include a growth medium.
 19. An apparatus as in claim 17, wherein the one or more environmental factors include a stage of development of a substance of interest.
 20. An apparatus as in claim 17, wherein the one or more environmental factors include a stage of development of a substance not of interest.
 21. An apparatus as in claim 17, wherein the plurality of spectral measurement data comprises multiple measurements of emitted radiation at a resolution of lower than 0.1 cm⁻¹.
 22. An apparatus as in claim 17, wherein the processing algorithm includes a multispectral data processing algorithm for generating multispectral data.
 23. An apparatus as in claim 17, wherein the illuminating is with radiation in the ultraviolet wavelength range.